A dynamically structured matrix population model for insect life histories observed under variable environmental conditions

Various environmental drivers influence life processes of insect vectors that transmit human disease. Life histories observed under experimental conditions can reveal such complex links; however, designing informative experiments for insects is challenging. Furthermore, inferences obtained under controlled conditions often extrapolate poorly to field conditions. Here, we introduce a pseudo-stage-structured population dynamics model to describe insect development as a renewal process with variable rates. The model permits representing realistic life stage durations under constant and variable environmental conditions. Using the model, we demonstrate how random environmental variations result in fluctuating development rates and affect stage duration. We apply the model to infer environmental dependencies from the life history observations of two common disease vectors, the southern (Culex quinquefasciatus) and northern (Culex pipiens) house mosquito. We identify photoperiod, in addition to temperature, as pivotal in regulating larva stage duration, and find that carefully timed life history observations under semi-field conditions accurately predict insect development throughout the year. The approach we describe augments existing methods of life table design and analysis, and contributes to the development of large-scale climate- and environment-driven population dynamics models for important disease vectors.

Life tables provide valuable insights into the environmental dependence of many species 1 . They have been used to inform species conservation 2 and vector and pest control 3,4 . By providing the biological foundations, they have become essential components of mathematical models that predict population dynamics and disease risk [5][6][7][8] .
A common practice in constructing life tables is to follow a cohort of organisms under controlled conditions and catalogue the number of individuals at each development stage together with a set of processes, such as mortality and fertility 1 . Several statistical methods have been derived to infer realistic development time distributions and survival rates from life history observations 9 . Some of the early pioneers assumed normal or gamma distributions for stage durations and a time-varying survival probability [9][10][11] . For many insect vectors, including mosquitoes, ticks, and small biting flies, the assessment of viability during development can be challenging 12,13 . Population heterogeneity, as an added complexity, may arise from the overlap of subsequent development stages. Experimental interventions, e.g. separating individuals or development stages, to improve observations may interfere with the natural processes 14,15 . On the other hand, most insect species are strongly affected by external temperature variations as they lack the ability to regulate their body temperature 16 . Attempts to establish controlled experimental conditions are known to result in inferences which may not be applicable in the field under diurnal variability 17,18 .
Understanding the environmental dependencies of population dynamics thus presents an inverse problem, where observations inform mathematical models (as opposed to the forward problem of finding a solution to a model) 19 . Inverse problems involve mechanistic models, which represent a simplified version of the reality. To provide accurate insights into environment-driven population dynamics, models should ideally capture as much www.nature.com/scientificreports/ of the complexities of the biological processes, environmental drivers, and experimental methods 20 . Population heterogeneity, time delay of development, and variable environmental conditions could be accounted for in an ideal model. Finally, the computational demand of the mathematical framework should enable fast optimisation in a multi-dimensional parameter space 21,22 . Several modelling frameworks have been employed for environment-driven population dynamics of insect vectors that transmit human disease. Some of the most popular ones are based on ordinary differential equations (ODEs) 6,[23][24][25][26] . Despite their prevalence and relative ease of use, the canonical ODE framework poorly represents population heterogeneity and time-delay caused by development 27,28 . Extensions have been made to incorporate realistic development time distributions by incorporating a series of sub-stages with identical exponentially distributed dwell times (also known as the linear chain trick) 27,29,30 . Delay differential equations (DDEs) have also been used to model insect population dynamics with true time delays in development [31][32][33] . Although the framework implies a homogenous cohort, heterogeneity can be introduced at the beginning through variable entry times. Recently, the framework has been extended towards representing trait variation under environmental change in successive development stages and generations 34 .
Alternatively, discrete-time matrix population models (MPM) have been used to represent populations structured with respect to age 35 or physiological stage 36 . MPMs make use of carefully designed projection matrices and matrix algebra to project population structure from one census date to the next 37,38 . Integral projection models (IPMs) 39 and the age-and stage-structured model of Caswell et al. 40 accommodate populations structured by a combination of age and a physiological trait, such as body size or wing length. Extensions have been made to accommodate a variable environment based on detailed age-trait observations [41][42][43] .
The age-structured population dynamics model of Erguler (sPop) is an MPM incorporating a dynamic age structure 44 . As specified by the model, individuals in a development stage are grouped into distinct age classes, and the population structure is propagated using hazard functions to generate the desired distributions of survival and development time. However, the effect of temperature on development is not exclusively age-dependent for many insect species, but it involves accomplishing a series of tasks through complex biochemical reactions, enabled by the heat absorbed [45][46][47] .
Here, we extend this framework by considering development as a cumulative process, where accumulation of a sufficient number of (intangible) units triggers stage completion. An analogy can be drawn to a mathematical concept called renewal processes 48 , where hypothetical events occur (arrive) randomly in time and iterate a process counter. By representing development as a renewal process, we extend the MPM framework with a dynamic state vector and projection matrix. As a result, the framework permits variations in both the number of pseudo-development stages and stage transition rates, and reproduce common development time distributions under the influence of environmental variation. To inform models based on this framework, we propose an experimental design of life table construction under variable semi-field conditions, which permits the inference of many environmental dependencies at once. We demonstrate this approach on two common disease vectors, the southern and northern house mosquito, Culex quinquefasciatus and Culex pipiens, respectively, and compare the information content of constant and variable-condition life table experiments. Methods sPop2: a dynamically-structured matrix population model. We represent development as a renewal processes 48 with a dynamic probability of event arrival. While each renewal event corresponds to a fraction of development, individuals with identical development fractions are grouped together to form a pseudo-stagestructured population.
During a single time unit (time step), we assume that the arrival time of each event is a nonnegative independent identically distributed random variable. Therefore, the number of independent events arriving in a time step is a discrete random variable, i, with a cumulative density function F(x, θ) = Pr(i ≤ x|θ) , where θ represents the expected time between successive events (interarrival time). The fraction of development achieved with each event is 1/k, where k is the target number of events to complete development. Thus, the rate of accumulation-the fraction of development completed-per time step is given by i/k.
Under the specific restrictions of invariable k and deterministic dynamics, this framework is equivalent to an MPM. To expand from the matrix population framework, we introduced a development indicator, q, which is used to structure the population; i.e. individuals with identical q values are grouped together to form a pseudostage. A single indicator value of q = 0 at the beginning of a simulation implies a cohort of individuals; however, a pre-structured initial population is also allowed. According to the framework, at each time step, q is incremented by i/k, and development occurs when q ≥ 1.
We developed Algorithm 1 to facilitate the dynamic handling of the pseudo-stages, allow for intrinsic stochasticity, and accommodate variable interarrival times. We implemented the algorithm in C and distributed it under the GPL 3.0 license on the GitHub repository https:// doi. org/ 10. 5281/ zenodo. 57883 77 as an extension to the age-and stage-structured population dynamics model of Erguler 44 (sPop2 v.2.1).
Algorithm 1 simulates a single step of development in a deterministic or stochastic setting. The development scheme is selected by setting the cumulative density function, F(x, θ) , according to (i) I(x ≥ θ) for fixed duration, (ii) 1 − θ x+1 for Pascal-distributed, and (iii) Ŵ(x + 1, 1/θ)/x! for Erlang-distributed development time. In this context, I is the indicator function and Ŵ(x, θ) is the upper incomplete gamma function. We present the detailed derivation of the three schemes in Development time distributions. To accomplish dynamic population restructuring in one pass, we employ the hazard function, H(x, θ) , i.e. the probability of x events arriving conditional on the absence of fewer events, www.nature.com/scientificreports/ where Scheme selection is followed by iteration through the existing pseudo-stages. Here, we adopt the notation q = q 0 , q 1 , . . . to represent the array of pseudo-stage classes and n = {n 0 , n 1 , . . . } the individuals assigned to each class. At each step, each q is expanded into an array of pseudo-stage classes, q ′ , as a result of incrementing q by i/k for every possible value of i. An appropriate fraction of the sub-population associated with q is allocated to each class in q ′ either deterministically or stochastically (given the probability of i). Once the smallest value of i such that q ′ ≥ 1 is reached, the remaining individuals, m, are flagged to complete development, and the algorithm moves forward to the next q. As a result, for each q with an allocated population size of n, the algorithm generates one or more pseudo-stages ( q ′ ) among which the population ( n ′ ) is distributed accordingly.
It is important to note that the probability of i leading to q > 1 in a time step may become significant especially with a relatively large arrival rate. Such overshooting will not prevent a population from completing development as required; however, may create bias should the process be followed by another stage of development. Overshooting can be minimised by extending the loop above q = 1 to maintain a record of all individuals with q > 1 . In turn, the ones completing development form a structured population to be fed into the next stage of development. However, there may be cases where q > 2 in one step, which require multiple successive stages to utilise the accumulated development. Consistent overshooting is a clear indication of a coarse time scale. However, due to the indirect time dependence of the renewal process, reducing the step size involves a change in the desired duration and rate of these processes (see Shifting the unit of time).
(1) www.nature.com/scientificreports/ Simulating development with renewal processes under variable conditions. We refer to development represented by renewal processes as cumulative development. To demonstrate how a change in rate manifests in cumulative development and how alternative representations capture the process, we constructed a scenario where a group of individuals is subjected to two different development rates. We employed a population of 100 individuals with a deterministic Erlang-distributed development time of 40 ± 5 steps (a mean of 40 and a standard deviation of 5 steps corresponding to k = 64 and θ = 0.625 ). At the 20 th step, we switched development time to 20 ± 5 steps ( k = 16 and θ = 1.25 ) and continued simulation until all individuals complete the process. We constructed four alternative models to compare with cumulative development. The first alternative is an age-structured MPM constructed using the sPop package 44 . The sPop algorithm dynamically structures a population with respect to the time spent in development, and the proportion of individuals completing development in a step ( d(τ ) ) is given by a gamma distribution with mean µ and standard deviation σ . We note that we use t to represent time in continuous domain and τ in discrete domain in this context. When we let x τ represent the size of a population at step τ, where f (τ ) = γ (k, τ/θ)/ Ŵ(k) , k = µ 2 /σ 2 , and θ = σ 2 /µ.
We employed the stage-structured DDE framework developed by Nisbet, Gurney, and Lawton as the second alternative 31,32 . By using the stagePop package written in R by Kettle and Nutter 49 , we represented the process with a variable development duration function, and a rate of recruitment function, We used R(t) to construct a population history with a gamma-distributed age-structure between t = −40 and t = 0 to result in stage completion at t = 40 ± 5 . Since this is an artificial construct, we employed modified development rates for favourable (1/40) and unfavourable (1/60) conditions to yield approximately 20 ± 5 and 40 ± 5 units of development time, respectively. The R code implementing this model is given in The variable-rate development model in R..
The third alternative is an ordinary differential equation (ODE) of the form where is the reciprocal of average development time (shifting from 0.025 to 0.05 at t = 20). Finally, we introduced time dependence to the ODE by applying the linear chain trick (LCT) 27,29 , which results in where γ = 1/θ.

Transformation of environmental variation in development.
To investigate how frequent changes in the environment may affect development, and expand on the previous section, we assumed that there is a non-linear relationship between temperature and development rate. We employed the Briere-1 function, which is widely used to model development rate in insects 50 , where f r (T) is the reciprocal of development time in days, T is temperature in degree Celsius, T L and T H are lower and higher temperature thresholds, and α is a scaling constant.
We conjectured a population of 100 individuals, with a deterministic Erlang-distributed (cumulative) development time, and defined the distribution with a variable mean ( µ ) and a fixed standard deviation of 5 steps. We let µ be the reciprocal of f r (T) , linking mean development time to temperature.
Next, we assumed that the population develops under a variable temperature regime, ρ , which is randomly sampled at each time step from one of the three alternative Gaussian distributions: (25,4) , and (iii) ρ H ∼ N (35,4) , where N(µ, σ ) indicates a Gaussian distribution with mean µ and standard deviation σ , and ∼ defines the probability distribution of a random number. www.nature.com/scientificreports/ Population dynamics model of environment-driven insect development. We represented mosquito development with a generic 4-stage population dynamics model where eggs develop into larvae, then pupae, and emerge as adults. We assumed that egg, larva, and pupa development times are Erlang-distributed 9,10 , the mean of which are linked with ambient temperature using generic response functions, and assumed timeindependent daily survival for each stage (see The population dynamics model in C for detailed implementation). We linked mean development time ( µ ) with temperature (T) with the widely used Briere-1 function (Eq. 9), where T 0 and T 1 are temperature thresholds ( T 1 > T 0 ), and α s is a scale parameter. Since each dependency adds 3 parameters to the model, we assumed that the standard deviation of the distribution is proportional to its mean, i.e. σ = α m µ , to limit the total number of parameters. Furthermore, we employed a quartic (fourth degree) equation 7,51 to represent the daily mortality rate, p m , of each stage as a function of temperature, where p 0 is the lower boundary of mortality, T m is the optimum temperature, and α s is a scale parameter. We clipped p m between 0 and 1 to represent a fraction. Overall, the generic model has 24 parameters, values of which will be inferred using life history data.
In addition to temperature, we assumed that development of insect larva may also depend on photoperiod 52-54 , and employed a generic sigmoidal relationship, where P represents day length at the corresponding latitude and date, α p is the threshold, and α s and α q are scale parameters. We concluded this extended version with a total of 27 parameters. An illustrative C code implementing both the generic and extended models is given in The population dynamics model in C.
Further extensions to this model to represent the complete life cycle with egg laying is straightforward. Although we included an example model in the sPop2 v.2.1 package, we plan more in depth investigation of complete life cycles in future studies.
Inverse modelling with approximate Bayesian computation. We followed the inverse modelling procedure described in Erguler et al. 55 , which involves proposing a generic model with exploratory environmental dependencies to be calibrated with observations using Bayesian principles. Accordingly, we propose a model, estimate an optimum parameter configuration based on an observational dataset, and sample a set of alternative configurations around the optimum using approximate Bayesian computation (ABC) 56 . To arrive at the optimum, we employed the hoppMCMC (v1.1) parameter optimisation and posterior sampling algorithm, which uses an adaptive basin-hopping Markov chain Monte Carlo (MCMC) method 57 .
We assumed a uniform prior, Pr(θ) , and replaced the likelihood function 58 , Pr(δ|θ) , with a simulation-based distance (score) function, f (δ, y(θ)) , where δ represents observations and θ is the parameter configuration of the model. A score function is a measure of similarity between δ and the model output, y(θ) (with parameter θ ). We employed the negative logarithm of the Poisson probability as the score function (SS), where δ t is the observation at time t and y t (θ) is the corresponding simulation using θ . According to this, the posterior is approximated with a given threshold ε, and it tends to its true value as ε → 0 . Despite the use of approximation, exploring the entire posterior distribution is computationally demanding, particularly when the number of parameters is large. To reduce the computational demand, we generated partial posterior samples around the optimum, i.e. sampled from the posterior mode, 8 60,61 . To calibrate the generic model against these experiments, we employed the parameter optimisation/posterior mode sampling procedure with three replicates from each temperature regime. Following the experimental protocol, we initiated cohorts with 750 larvae and simulated development until adult emergence. As a result, we obtained 100 parameter samples from a posterior mode we labeled as q at ε = 2500. To evaluate development under variable conditions, we designed life   table experiments for Cx. pipiens biotype molestus under variable semi-field conditions at the premises of the University of Novi Sad in Petrovaradin, Serbia. We introduced 16-to-24 h old egg rafts of the local laboratory colony in three Bioquip© mosquito breeders, set and kept outside in the semiurban (house garden) environ- www.nature.com/scientificreports/ ment, from 1 February to 31 December 2017. We recorded hatching, transition of instars through the larval stages (L1 to L4), pupation, and adult eclosion time every day at 8 am local time. We added a pinch of Tetramine™ fish food after observing larvae hatching from the eggs. We obtained five-minute temperature readings of the ambient air from the weather station close to the breeders (1 m). New egg rafts were introduced after all larvae/ pupae died or adults emerged, and a total of 16 experiments were performed (Fig. S4). We note that the initial conditions of the experiment shown in Fig. S4(b) were missing, and we estimated the number of eggs with a Poisson probability, assuming 30% mortality according to Spanoudis et al. 62 . We followed the inverse modelling procedure and used seven of these experiments-one from (a), (b), (d), one from (f), one from (g), (i), and (j) shown in Fig. S4-to obtain 100 parameter samples from the optimum posterior mode, p ( ε = 1600 ), for the extended population dynamics model. We show, in Fig. 1, that our dynamic pseudo-stage-structured MPM yields a gradual stage completion with an average development time of approximately 30 ± 5 steps (solid dark lines) when conditions shift at τ = 20 (each step corresponds to 1 time unit). The target Erlang-distributed development trajectories without the shift are shown as dashed gray lines. The snapshots of the population structure, represented by the development indicator q, taken at each time step, show that half of the development is complete at the time of the switch and the switch accelerates the accumulation of q (Fig. S1).

Semi-field life table experiments.
In age-dependent development, a sharp transition, instead of a gradual one, is observed at the 20 th step (Fig. 1a). The switch results in the majority of individuals reaching target development age immediately at the time of switch. Previous work, reported in Erguler et al. 59 and Erguler et al. 55 , aimed at modelling population dynamics under variable conditions, based on this dynamic age-dependent framework. Our results suggest that cumulative development might improve the fit to the data, prediction accuracy, and applicable geospatial range of these models.
We see in Fig. 1b that the canonical ODE framework represents an exponentially distributed development time and a shift in rate at t = 20 . The LCT extension to the framework helps to incorporate time dependence and represent the long and short development time distributions (Fig. 1c). The resulting model accommodates change in the rate parameter γ (Eq. 8), e.g doubling of γ changes development time from 40 ± 5 to 20 ± 2.5 . However, to accommodate the required shift, the model needs to be transformed from a 66-dimensional system to an 18-dimensional one, which is beyond the scope of this work. We argue that in cases where development time distribution is fixed a priori (excluded from model calibration), the LCT framework provides a significant advantage over canonical ODEs. Although the framework has been used in the field of infectious disease epidemiology 64,65 , it has recently been applied to the modelling of vector population dynamics 30 .
The DDE framework also yields a gradual development trajectory with an intermediate duration (Fig. 1d). However, the distribution tends towards the longer development trajectory compared to the one achieved with cumulative development. The canonical DDE framework assumes a homogenous cohort, where all individuals react in the same way to variations in development rate. The assumption gives rise to sharp stage transitions within a single generation if all individuals are introduced at the same time. As a potential workaround, it has been proposed to generate a plausible population history, through variable entry times, until the required (or observed) developmental variation builds up 31,32 . Variation in development rates then acts upon the population and results in the modification of the existing age-structure. It is worthwhile to mention that a recent extension to the DDE framework to accommodate trait variation in population dynamics 34 might also accommodate changing development rates within a single stage; however, it has not yet been employed at this scale.
Cumulative development is in agreement with the widely known degree-day (DD) framework, where development time is predicted by the heat accumulating in organisms 46 . Although the rate of accumulation in response to environmental conditions varies considerably, the DD framework implies that the combination of two different rates yields an average development time (also seen with cumulative development in Fig. 1). Experimental evaluation of this will be the topic of future research.
It is worth mentioning that our dynamically structured renewal process-based MPM follows the assumption of random population heterogeneity 9,11 ; namely, at the individual level, the future behaviour of an organism is not affected by its historical behaviour. However, trait variation within a population is prevalent in many species, and www.nature.com/scientificreports/ is known to impact population dynamics and species interactions 34,66,67 . Future development of our framework will consider improving upon this limitation.

Environmental variation transformed into development times. Several non-linear relationships
have been proposed to represent the temperature dependence of insect development 68 . A common feature is the presence of low and high temperature thresholds beyond which development is prohibitively slow. Often, there exists an optimum between the thresholds where the process is most efficient. A typical relationship between temperature and development rate, reported in Briere et al. 50 , is seen in Fig. 2a. Mean development time, given by the reciprocal of rate in Fig. 2b, exhibits the two thresholds and the optimum. To investigate how temperature variation is transformed into cumulative development time, we assumed three variation regimes at relatively low, medium, and high temperatures ( ρ L , ρ M , and ρ H , respectively). Densities of the corresponding Gaussian probability distributions are plotted in Fig. 2b. Accordingly, each variation is transformed by a slightly different region of the rate function (Eq. 9). Eventually, the three development time distributions emerge as shown in Fig. 2c.
We found that the output of ρ H is skewed towards longer durations compared to what we would otherwise obtain if we simulated the process under constant conditions with the mean of ρ H . The impact of variation in the middle range, ρ M , is similar to that of ρ H , but less pronounced. Conversely, the output of ρ L is skewed towards shorter durations. Our results suggest that, when development is already highly efficient, variation in temperature causes frequent encounters of longer (but not shorter) development durations, eventually extending the overall duration of the process. In the low efficiency range, development takes long to complete, but frequent encounters of relatively short durations-especially as the process approaches its optimum duration-triggers completion earlier than in the case of no variation. www.nature.com/scientificreports/ Overall, our model predictions are in agreement with the rate summation effect, which states that the different outcomes obtained under constant and varying temperatures is due to the non-linear relationship between temperature and development rate (the Kaufmann effect) 16 . Furthermore, acceleration of development in insects subjected to varying high temperatures, its retardation at varying low temperatures, and low variability of development time in the linear range of the rate curve have been widely discussed 69  We note that the variation in development times is due to temperature since we ignore intrinsic stochasticity to demonstrate the impact of ρ in isolation. The deterministic setup removes the upper limit in the number of distinct pseudo-stage indicators: a different q emerges from each k, and a different k emerges from each ρ . Since the number of pseudo-stages quickly exhausts the computational resources, we set the precision of q to the nearest 100 th decimal point, effectively capping the number of pseudo-stages at 100 (see Accuracy of the pseudo-stage approximation). As shown in Fig. S2, the approximation has a negligible impact on accuracy.

Environmental dependency extracted from life tables under constant conditions. Having dis-
cussed the importance of environmental variability in development, in this section, we employ a well-established experimental method to unravel the relationship between temperature and development time in a common mosquito species. In contrast to invasive vectors, which effectively render new territories suitable for disease transmission, Culex species pose an imminent threat with their wide distribution and ornitophilic (Cx. pipiens biotype pipiens), mamophilic (Cx. pipiens biotype molestus), and intermixed (their hybrids) blood feeding behaviour. Here, we investigate the temperature dependencies of mortality and development of Cx. quinquefasciatus, the southern house mosquito, which is an important disease vector, widely distributed across the tropics and sub-tropics 73,74 .
To infer the dependencies, we used a generic temperature-driven insect development model, described in Methods, and the life history observations performed at five constant temperatures (15,20,23,27, and 30 • C) under laboratory conditions 60,61 . As a result of the inverse modelling procedure, detailed in Methods, we found that the generic model yields an overall match between the simulations and observations. In Fig. 3a, we present a comparison of observed and simulated maximum production and the stage-emergence times for pupae and adults. Here, we define the stage-emergence time as the time taken from the beginning of an experiment to the time when half of the maximum production of a stage (pupa or adult) is observed. In addition, in Fig. S3, we present the comparison of time trajectories separately for each temperature.
We found that the generic model faithfully replicates the observed development times of larvae and pupae. On the other hand, stage mortalities are predicted well at three temperatures, but are overestimated at 20 or 27 • C. The impact of temperature on mortality might be more complex than it is captured by the quartic equation (Eq. 11). Optimum survival seen at 27 • C suggests that the relationship might be non-symmetrical or multimodal. www.nature.com/scientificreports/ In addition, the observed variability in mortality suggests that the mismatch could also be due to experimental error or the intrinsic stochasticity of the biological processes. We extracted the functional forms of temperature dependence from the posterior samples, shown in Fig. 3b, c, and found that the data inform the model as expected within the temperature range of the experiments ( 15−30 • C). Stage durations are well informed, and reflect the low variability seen in the data (the standard deviation is less than 1.5 days at all temperatures for both stages). Accordingly, pupae develop in less than 4 days, which is much shorter than the larva development time (between 10 and 20 days above 20 • C). The model predicts that the minimum temperature at which development occurs (from the larva stage) is 10.5 • C, which is close to 10.9 • C, reported in Grech et al. 75 .
The observed variability in pupa and adult production suggests that survival is a highly stochastic process regardless of the controlled laboratory conditions. A deterministic model, such as the one used in this context, represents the mean of such processes but does not capture their variability. The simulated variability is a result of the uncertainty in parameter estimates. Model parameters contribute unequally to the output as a result of the model structure and the functional forms of temperature dependence, and the data inform certain parameters better than others 76,77 . For instance, daily mortality, shown in Fig. 3c, is more constrained for larva than pupa, which is likely due to the short duration of the pupa stage-changes in daily mortality have larger consequences as development time increases.
We note that a well-informed model yields predictions in the form of verifiable hypotheses; however, these are not necessarily accurate predictions. Model accuracy is assessed when such hypotheses are experimentally tested as part of the cyclic process of model development 78 . Here, we demonstrated that our modelling framework can be used to derive biologically meaningful inferences and to help improve the understanding of the temperature dependence of Cx. quinquefasciatus.
Greater information content of semi-field experiments. The number of experiments required to test a range of conditions, including different combinations of multiple drivers, may quickly exhaust available resources. Moreover, variable conditions may have a previously unaccounted impact on development and mortality. In this section, we demonstrate that observations performed under variable conditions are valuable sources of information for our modelling framework, which is capable of representing the dynamics under such conditions.
Cx. pipiens, the northern house mosquito, is a competent disease vector, widely distributed across the temperate countries in North America, Europe, Asia, and North and East Africa 74,79 . Unlike Cx. quinquefasciatus, Cx. pipiens biotype pipiens is known to enter a reproductive diapause phase, where adult females arrest oogenesis during harsh winter conditions 80,81 . When larvae are exposed to short photoperiods and low temperatures during development, they emerge as adults destined to diapause. Although Cx. pipiens biotype molestus has lost the ability to diapause, its immature stages have been reported to retain metabolic sensitivity to photoperiod 82,83 .
To reveal the environmental dependence of the molestus biotype, we exposed its eggs to variable temperatures in semi-field conditions until adult emergence (or loss of cohort). The numbers of viable larvae, pupae, and adults observed in different experimental batches are given in Fig. S4. We employed the extended model with both temperature and photoperiod dependence (see Methods), and calibrated the model against seven of the semi-field experiments, performed in March, May, June, July, August, and September ( Fig. S4(a), (b), (d), (f), (g), (i) and (j)).
As a result, we found that the model replicates the patterns of abundance emerging in the observations, e.g. stage timing and maximum adult production, reasonably well in most of the experiments, regardless of the times during which they were performed (Figs. S5 and S6). Quantitative evaluation of the agreement reveals that the observed and simulated adult emergence times are less than a week apart (Table 1).
On the other hand, we found that egg and larva mortalities, and also, pupa and adult production are highly variable in the observations (see Fig. S4(c), (f), and (g)). Spikes of larva mortality are seen in Spring and Autumn (especially in May, September, and October). Despite this variability, the difference between the predicted and www.nature.com/scientificreports/ observed adult production was around 11 or less, except in the case of the experiment E7, which unexpectedly yielded only one pupa and no adults. We obtain relatively large mismatches when predicting larva abundances, specifically where egg mortality is not predicted well (E5, E7, E8, E10, E11, E12). We hypothesise that the stress associated with rearing labgrown specimens under variable conditions might elevate egg mortality, induce premature hatching, or affect the survival of the larvae produced. Since egg development starts inside gravid females, i.e. under the optimum conditions of the laboratory, the observable part of development subjected to variable conditions remains mainly the hatching behaviour. Consequently, we observed rapid and synchronous completion of the egg stage in all experiments (see Figs. S5 and S6). Being exposed to a narrow range of temperatures, relatively less information can be obtained on the environmental dependency of the egg stage. As a potential improvement, we recommend that future adaptations of the semi-field experiments consider using field-captured adult female mosquitoes as the source of eggs.
In addition to egg mortality, we observed spikes of larva mortality in May (E3), July (E8), and in Autumn (E14, E15, and E16). A likely cause of such transient high mortality is brief temperature shifts towards the extremes. However, the rarity of such events prevents the inverse modelling procedure from adequately capturing their impacts on life processes. As a potential improvement, we recommend that the experiments are performed in overlapping time frames, increasing the likelihood of observing the impact of an extreme event at different times during development. We note that the early decline in larva abundance seen in Autumn could be a result of insufficient food supply due to the increased nutritional requirements. According to the proposed metabolic response to short photoperiods, larvae would require additional food to accumulate fat reserves in preparation for diapause, the state where adult females endure several months without feeding. This implies that development takes longer than it would at long photoperiods when subjected to similar temperature regimes.
Using the extended model and the semi-field data, we identified the environmental dependencies shown in Fig. 4. The data informed about the temperature dependency of each life stage as well as the photoperiod dependency of larvae. As expected, the overall variability in the inferred dependencies is higher for Cx. pipiens compared to Cx. quinquefasciatus (Fig. 3). We found that the larva and pupa development times closely match the observations reported by Spanoudis et al. 62 at long photoperiods (see Fig. S7). However, the development times reported in Kiarie-Makara et al. 84 at short photoperiods and moderate temperatures do not suggest a significant impact of daylight, which could be due to the particular strain of Cx. pipiens used in these experiments. As expected, the temperature dependency of egg development was not well informed by the data in the current configuration of the model and the functional forms of environmental dependence.
We found that the photoperiod dependency is significantly non-linear with an average threshold of 13.7 hours of daylight (Fig. 4c). Photoperiod-driven extension in development time (about 1.7 times more at 13:11 h L:D than at 15:9 h L:D) contributes to improving the accuracy of predictions at the end of the high season (Fig. S8). The critical photoperiod (CPP) agrees well with the ones identified for Cx. pipiens biotype pipiens 85,86 . For instance, Sanburg and Larsen reported that there is an exponential relationship between follicle sizes in adult females (signifying commitment to diapause) and the photoperiods they were exposed to during immature stages 85 . We inferred a similar (but reverse) gradient between photoperiod and the extension of larva development time from 15 to 12 hours of daylight (Fig. 4c). Cx. pipiens over the calendar year by setting up a hypothetical experiment at the beginning of each week. We simulated the subsequent development dynamics and obtained the annual development profile as shown in Fig. 5. Accordingly, the immature stages begin development from late February and the first adults emerge in May (adults emerging late in May start developing in the experiments set up late in March). The profile is consistent with the regular Cx. pipiens high season in the region. As seen in Fig. 5, predicted adult emergence times agree well with the observations throughout the high season. However, there is a greater variation in the maximum number of adults than the times of emergence (extending to almost 40% of the possible outcomes in early August). A greater variability (almost 80% in August) is seen in the corresponding observations, which we transformed into the percentage of eggs emerging as adults (where available) to facilitate comparison. According to the model, variation in adult production is associated with the variation in both development times and mortality during immature stages. We recall that the uncertainty in the informed environmental dependencies is high around relatively less frequently encountered values-especially the lower and higher temperature extremes (Fig. 4). Specifically, egg development times cannot be identified precisely, but immediate hatching of the larvae is predicted between 20 and 25 °C. Consequently, we found that frequent exposure to temperatures outside the well-informed range have a significant impact on the variation in adult production (Fig. S9).
We adopt the time of first adult emergence as a proxy of the first generation of adults in the season. According to our model, early adult emergence is a result of shorter development times and higher success rates, which indicates that the temperature conditions allow for an early first generation of adults. An early first generation greatly contributes to an early peak of adult abundance, which may increase the risk of vector-borne disease transmission in humans. For instance, an early peak of abundance may cause an early start of West Nile virus circulation and amplification in Culex pipiens and their avian hosts, which increases the likelihood of virus spillover to humans 51,87 . Anecdotal evidence shows that the anomalously hot April and May that occurred in 2018 in Serbia shifted the peak of Cx. pipiens abundance forward by more than one month (Petrić et al., unpublished).  www.nature.com/scientificreports/ Similarly, 2018 was the year with the largest number of autochthonous West Nile virus infections throughout Europe (more than the total of the previous seven years together) 88,89 . In summary, our results showed that the semi-field experiments, when used in combination with our dynamic pseudo-stage-structured MPM, help to develop predictive models and inform over a wide range of environmental conditions. We developed a predictive model of Cx. pipiens biotype molestus development and gained insights into the specifics of temperature and photoperiod dependencies by reducing the need of extensive laboratory data. We used life history observations from 7 experiments performed under semi-field conditions and employed a generic model structure, largely uninformed on the specific environmental dependencies of the species. The cumulative development framework we introduced applies broadly to poikilotherms subjected to highly variable environmental conditions. Although the generic model structure helps to develop exploratory models and identify potential environmental dependencies, accuracy can be improved by customising the models for the known dependencies of particular species. With a straightforward extension of the development model to cover the complete life cycle (with egg laying and density dependence), it is possible to incorporate field observations of eggs or adult mosquitoes, and develop an environment-driven population dynamics model.

Conclusions
There is an urgent need to unravel the intricate physiological links between mosquitoes and their environment to quantify the impact of climate warming and control the future spread of disease. Here, we described an inverse modelling approach combining a pseudo-stage-structured population dynamics model and a semi-field design of life table experiments. The model allows for variability in development rate during the process, and is suitable for representing insect life cycles, subjected to highly variable environmental conditions. The combination can be used to accurately characterise a wide range of external drivers without the need to collect large amounts of data. Consequently, our approach complements the analytical and experimental methods needed for developing predictive large-scale climate-driven models for many insect species, such as those important for disease transmission, species conservation, and forensic investigation.